AI & Data Literacy by Bill Schmarzo
Author: Bill Schmarzo
Language: eng
Format: epub
Publisher: Packt
Published: 2023-11-15T00:00:00+00:00
Understanding probabilities and statistics
Making predictions about likely outcomes is a challenging task. As famously stated by Yogi Berra, “It’s tough to make predictions, especially about the future.” Accurate predictions rely on a nuanced understanding of probabilities, confidence levels, and confidence intervals.
Probability is a measure of the likelihood that a particular event will occur, typically expressed as a percentage (ranging from 0% to 100%). For example, based on Barry Bonds’ 2004 season with the San Francisco Giants, we can estimate the probability of him getting a hit in a given at-bat as 36.2% (equivalent to an average of 36.2 hits for every 100 at-bats).
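As a minimal sketch of how a probability like the 36.2% above can be put to work, the following Python snippet computes the expected number of hits over a given number of at-bats and the chance of at least one hit across several at-bats. The function names are illustrative, and treating every at-bat as independent with the same hit probability is a simplifying assumption made for this example, not a claim from the text.

```python
def expected_hits(hit_probability: float, at_bats: int) -> float:
    """Average number of hits expected over the given number of at-bats."""
    return hit_probability * at_bats


def prob_at_least_one_hit(hit_probability: float, at_bats: int) -> float:
    """Chance of one or more hits, assuming independent at-bats."""
    # The complement of "no hits at all" is "at least one hit".
    return 1 - (1 - hit_probability) ** at_bats


HIT_PROBABILITY = 0.362  # the 36.2% figure quoted in the text

print(round(expected_hits(HIT_PROBABILITY, 100), 1))         # 36.2
print(round(prob_at_least_one_hit(HIT_PROBABILITY, 4), 3))   # 0.834
```

Even with a 36.2% chance per at-bat, a hit in any particular game is likely but not guaranteed, which is the sense in which probabilities describe likelihoods rather than outcomes.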
Understanding probabilities is vital for assessing the likelihood of specific outcomes, equipping us with the necessary insights to make informed decisions. It is crucial to acknowledge that probabilities serve as estimates derived from available data and statistical analysis. While probabilities provide a framework for evaluating relative likelihoods, it is important to remember that they do not guarantee definitive outcomes. Therefore, to enhance the effectiveness of our predictions, it becomes imperative to harness the power of statistics.
Statistics is the practice or science of collecting and analyzing numerical data in large quantities, especially to infer the proportions in a whole population from those in a representative sample. By leveraging statistical techniques, we can analyze patterns, identify correlations, and uncover valuable insights that enable us to make more accurate and reliable predictions.
When using statistics to help us calculate probabilities and make predictions, we need to understand the statistical concepts of the mean (or average), variance, standard deviation, confidence intervals, and confidence levels. These are basic statistical concepts that everyone needs to understand in order to leverage statistics to make more informed decisions. Let’s define these basic concepts:
The mean or average is the sum of a collection of numbers divided by the count of numbers in the collection.
Variance measures how dispersed a set of numbers or observations is around its mean. It is the average of the squared differences between each observation and the mean.
Standard deviation is simply the square root of the variance, which expresses the spread in the same units as the data. A low standard deviation (near zero) means data points are clustered close to the mean, and a high standard deviation indicates data points are more spread out. Standard deviation describes only how far data points typically lie from the mean, not whether they fall above or below it.
The confidence interval is the range of values you expect your estimate to fall between for a certain percentage of the time if you rerun your experiment or re-sample the population similarly.
The confidence level is the percentage of time you expect to reproduce an estimate between the upper and lower bounds of the confidence interval.
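The definitions above can be sketched in a few lines of Python using only the standard library. The sample values and the `summarize` helper are hypothetical, chosen purely for illustration. The confidence interval here uses a normal approximation for the mean (mean plus or minus z times the standard error); this is one common choice and an assumption of this sketch, and for small samples a t-distribution based interval is usually more appropriate.

```python
import math
import statistics


def summarize(sample, confidence_level=0.95):
    """Return mean, variance, standard deviation, and a confidence interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Sample variance: squared deviations from the mean, divided by n - 1.
    variance = statistics.variance(sample)
    std_dev = math.sqrt(variance)
    # z-score that leaves (1 - confidence_level) split across both tails.
    z = statistics.NormalDist().inv_cdf((1 + confidence_level) / 2)
    margin = z * std_dev / math.sqrt(n)  # z times the standard error
    return {
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "confidence_interval": (mean - margin, mean + margin),
    }


# Hypothetical observations, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
result = summarize(sample, confidence_level=0.95)

print(result["mean"])                 # 5
print(round(result["variance"], 3))   # 4.571
print(round(result["std_dev"], 3))    # 2.138
print(tuple(round(x, 2) for x in result["confidence_interval"]))
```

Raising `confidence_level` (for example, to 0.99) widens the interval: being more confident that the range captures the estimate requires a broader range.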